ABSTRACT

Interactive  digital  processing  of  optical  data  is  shown  to  be  useful 
in  investigations  of  pattern  recognition  techniques.  A  system  including 
a  flying  spot  scanner  and  graphics  terminal  for  the  optical  to  digital 
conversion  and  subsequent  display  of  two-dimensional  images  is  described. 
Application  of  such  a  system  is  demonstrated  by  interactive  contour 
tracing  and  character  recognition  by  template  matching.   Algorithms 
necessary  for  the  applications  are  developed  and  tested. 

I.   INTRODUCTION

The purpose of this investigation was to determine if interactive
optical to digital conversion could be usefully employed in image
processing. An interactive system involving simple hybrid and graphical
display techniques to process two-dimensional images is described.
Application of this system is demonstrated by interactive contour tracing
and character recognition by template matching.

Optical character recognition (OCR) and image processing equipment
has long been envisioned as the ultimate means of data input. Despite
many recent advances in the interpretation of pictorial data, no machine
has yet been developed which is capable of competing with the human eye.
Even a simple sketch or photograph contains so much information that
describing it requires a comparatively large amount of language, either
written or spoken. It would therefore seem very worthwhile to direct
research efforts toward picture processing by modern high-speed
computers.

Perhaps the most immediate beneficiaries of OCR equipment are
paperwork-oriented businesses. Keypunch operators are generally higher
salaried than typists and also usually make more errors [1], [2]. Since
nearly all data input to an office computer must first be written or
typed, then translated to punched cards, it would be much simpler to go
directly from the written word to the computer memory. However, while
many types of equipment have been developed to perform optical scanning
and subsequent digital processing, there remains the problem of efficient
input of pictorial data: not only is batch processing difficult due to
the wide variety of document types and sizes, but such problems as
differences in simple character forms, e.g. script and printed, must also
be taken into consideration.

The types of applications which could benefit from digital processing
of images are many and diverse. Monitoring of continuous graphical or
photographic displays such as satellite photo-reconnaissance and satellite
meteorological information could be performed by image processing
computers [3]. Batch processing of medical laboratory data such as blood
samples, bacteria cultures, and X-rays could reduce the costs of such
analysis and help relieve the shortage of personnel trained to perform
these tasks [4].


II.   DESCRIPTION  OF  AN  OPTICAL  SCANNING  DEVICE 

This  investigation  involved  the  interactive  use  of  an  optical  scanner 
constructed  for  experimental  use,  a  COMCOR  Ci-5000  analog  computer,  an 
XDS-9300  digital  computer,  and  an  ADAGE  AGT-10  graphics  terminal.   A 
description  and  discussion  of  the  interactive  system  is  contained  in 
Section  III.  The  objective  was  to  control  the  scanner  and  subsequent 
digital  processing  from  a  graphics  display  console. 

The term "optical scanner", in its correct technical sense, means a
device whose function is the conversion of an image into electrical
signals. However, current usage of the term often includes the associated
hardware and software components which form a character or pattern
recognition system [1]. The first definition will be used here as it is
more precise and may prevent possible confusion with other system
components.

There are many different types of optical scanners in use today.
Most are based on the use of a cathode-ray tube, a revolving disk, or an
oscillating mirror [1], [5]. The cathode-ray, or flying spot, scanner
was the type used for this investigation and is widely used in the image
processing field. Perhaps the best known of these is the one used
in the Postal Service's OCR equipment, which locates and classifies zip
codes within the address portion of the envelope [1], [6]. A simplified
block diagram of a flying spot scanner is shown in Figure 1.

The operating principles of the flying spot scanner are not
complicated and are included here only as background material for the
understanding of the interactive processing system. As shown in Figure 1,
a spot on the face of the cathode-ray tube (CRT) provides a light beam
which is focused through a lens and then split into two beams. One of
these light beams passes through a transparency of the image to be
processed and then falls on the light-sensitive portion of a
photo-multiplier tube. The photo-tube transmits a voltage, whose value
depends on the intensity of light received from the beam, through
amplifier circuitry to the output line; this voltage is called the
transmitted light. If, for example, the transparency contains only
vertical black lines evenly spaced with transparent areas, then whenever
the light beam is positioned behind a black line little or no light
passes through to the photo-tube, hence a minimum or zero transmitted
voltage. When the light beam passes through a transparent area a maximum
voltage is transmitted. This simple example illustrates the binary case,
meaning only black (opaque) and white (transparent) lines are used to
form the image on the transparency. If the transparency contained
continuous shades of gray between the black and white extremes, such as
in a black and white photograph, then a continuous range of voltage
levels between the maximum and minimum levels could exist. This continuum
of possible voltage levels could then be quantized into a desired number
of gray levels.
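
The quantization step described above can be sketched as follows. This is
only an illustration: the function name, the uniform spacing of the gray
levels, and the clamping of out-of-range voltages are assumptions, not
details taken from the scanner hardware.

```python
def quantize(voltage, v_min, v_max, levels):
    """Map a transmitted-light voltage onto one of `levels` gray levels.

    Level 0 corresponds to the minimum (black) voltage and level
    `levels - 1` to the maximum (white) voltage. Uniform spacing is an
    assumption made for this sketch.
    """
    if levels < 2:
        raise ValueError("need at least two gray levels")
    if v_max <= v_min:
        raise ValueError("v_max must exceed v_min")
    # Clamp so that noise outside the calibrated range cannot overflow.
    v = min(max(voltage, v_min), v_max)
    # Scale to the interval [0, 1], then to an integer level.
    fraction = (v - v_min) / (v_max - v_min)
    return min(int(fraction * levels), levels - 1)
```

With `levels=2` the function reduces to the binary (black or white) case
discussed in the text.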

The second light beam produced by the beam splitter in Figure 1
passes through a lens, falls on a photo-tube, is amplified, and is
output on the line called reference light. This reference light value is
simply a measure of the intensity of the light beam from the CRT spot and
therefore can be used as a reference in determining the gray levels
in the transmitted light. The reference light could also be used
as a correction factor for any anomalies in the scanner, such as component
misalignment or CRT spot intensity variation at different points on the
CRT face.

The conversion of the image on the transparency into quantized data
suitable for input to a digital processing unit becomes simply a matter
of driving the X and Y deflection yokes within the CRT so as to move the
spot, and hence the light beam, through a scan pattern across the
transparency. In Figure 1 these are the X and Y control signals. The Z
control signal is used to control the intensity of the CRT spot so as to
maximize the difference between minimum and maximum transmitted light
signals.

Initially, tests were made with the scanner to determine the quality
of the gray-level signal transmission and to provide an approximate
measure of the resolution capabilities. Resolution is a relative term,
used in this application to mean the ability to distinguish between
closely spaced lines. There are several possible scanning patterns; the
most widely used is the raster scan, either vertical or horizontal.
However, in this investigation a less complicated scanning pattern was
chosen, not only for ease of implementation but also with a view toward
the subsequent digital-to-analog (DAC) and analog-to-digital (ADC)
conversions which would be applied. Basically, the X signal voltage
starts at the operator-selected point and increases in steps of dx until
the end point is reached; this results in the CRT spot being driven
horizontally from left to right. When the spot reaches the right-hand end
point the Y signal voltage is increased by dy and the X sweep is
repeated. A simplified time plot is shown in Figure 2(a) and the
resulting scan pattern in Figure 2(b). This pattern is also applicable to
discussions in later sections concerning scanning patterns.
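
A minimal sketch of the stepped scan pattern just described is given
below. The function and parameter names are hypothetical, and the
coordinates stand in for the X and Y deflection voltages; the original
program was written for the XDS-9300, not in Python.

```python
def scan_pattern(x_start, x_end, y_start, y_end, dx, dy):
    """Yield (x, y) positions in the stepped order described above.

    X sweeps from x_start to x_end in steps of dx; at the end of each
    sweep Y is increased by dy and the X sweep is repeated.
    """
    if dx <= 0 or dy <= 0:
        raise ValueError("dx and dy must be positive")
    # Count the steps once, so that repeated addition cannot accumulate
    # floating-point error over a long sweep.
    nx = int((x_end - x_start) / dx + 1e-9) + 1
    ny = int((y_end - y_start) / dy + 1e-9) + 1
    for j in range(ny):
        y = y_start + j * dy
        for i in range(nx):
            yield (x_start + i * dx, y)
```

For example, `list(scan_pattern(0, 2, 0, 1, 1, 1))` visits the three
points of the first row from left to right before moving to the second
row.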

FIGURE 2. SCANNING PATTERN

The results of the initial tests are shown in Figures 3 and 4. A
Hewlett-Packard Model 1300A X-Y Display was found to be a convenient
monitoring display throughout the investigation. For visual estimation
of the resolution and gray level range, it was necessary to generate
a fast TV raster-like scan. The phosphor on the HP-1300A CRT does not
retain brightness long enough for the eye to capture an image. Figures
3 and 4, then, are not completely accurate indications of the scanner's
capability but also reflect degradation due to the HP-1300A display
device and scope-photography techniques. It was determined that the
scanner is capable of distinguishing lines less than 0.5 millimeters
apart, which was more than adequate for this investigation.

Examples of the types of applications for which flying spot scanners
can be used were mentioned in Section I. At this point in the
investigation it was determined that two particular applications would
illustrate the potential uses of the scanner: (1) contour boundary
tracing, and (2) character recognition by template matching. The
technique of contour tracing involves the control of the light beam from
the CRT spot so that it "follows" a given contour on the desired image.
If the X-Y coordinates of the CRT spot are then sampled periodically and
used as digital input to a computer, the original contour is "digitized"
or "quantized" and can be further processed by conventional programming
techniques. Contour tracing could be applied effectively to many common
areas such as meteorological or topographical survey maps.

FIGURE 3. HP-1300A DISPLAY OF SCANNED CONTOURS TRANSPARENCY

FIGURE 4. HP-1300A DISPLAY OF SCANNED RESOLUTION AND GRAY-LEVEL
TRANSPARENCY

The second application serves to illustrate elementary character
recognition. The technique of template matching was probably the first
method used to recognize alpha-numeric characters by machine [7]. The
"template" is simply a stored-in-memory representation of a letter,
number or any other image. The optical scanner is used to scan and
quantize an unknown character, and the quantized unknown is then matched
against any given number of known templates. If a predetermined
correlation exists between any template and the given unknown then a
decision is made to "recognize" that unknown as a certain character [8].
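
The matching decision can be illustrated with a short sketch. The
agreement measure used here (the fraction of cells on which two binary
grids agree) and the names `match_score`, `recognize` and `min_score`
are assumptions chosen for illustration; the text does not specify how
the "predetermined correlation" was computed.

```python
def match_score(unknown, template):
    """Return the fraction of cells on which two binary grids agree."""
    same_shape = len(unknown) == len(template) and all(
        len(u) == len(t) for u, t in zip(unknown, template)
    )
    if not same_shape:
        raise ValueError("grids must have the same shape")
    total = sum(len(row) for row in unknown)
    if total == 0:
        raise ValueError("grids must not be empty")
    agree = sum(
        u == t
        for u_row, t_row in zip(unknown, template)
        for u, t in zip(u_row, t_row)
    )
    return agree / total


def recognize(unknown, templates, min_score=0.9):
    """Return the label of the best template, or None if none qualifies."""
    best_label, best_score = None, 0.0
    for label, template in templates.items():
        score = match_score(unknown, template)
        if score > best_score:
            best_label, best_score = label, score
    # Only "recognize" the unknown if the agreement is high enough.
    return best_label if best_score >= min_score else None
```

Here `templates` is a dictionary mapping a character label to its stored
grid of 0 (white) and 1 (black) cells, and the quantized unknown must
have the same dimensions as each template.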

It was decided to perform both applications in an interactive scanning
control mode of operation to determine if this could be an effective
method of utilizing a flying spot scanner.

III.   DESCRIPTION  OF  AN  INTERACTIVE  SCANNING  SYSTEM

The  process  of  interaction  can  be  thought  of  as  two  or  more  entities 
influencing  each  other's  behavior.   Specifically,  interaction  with  a 
computer  system  means  that  a  user  can  direct  or  control  a  problem  or 
sequence  of  problems  that  a  computer  might  be  solving  at  a  given  time; 
the  amount  of  interaction  is  a  relative  measurement  which  can  range  from 
very  low  to  very  high  levels.   Figure  5  shows  a  block  diagram  of  the 
interactive  hybrid  computer  system  used  to  conduct  this  investigation. 
The  human  user  in  this  system  exercises  control  over  the  scanning, 
processing,  and  display  functions  thereby  achieving  a  reasonably  high 
degree of interaction.

FIGURE 5. INTERACTIVE SCANNING SYSTEM BLOCK DIAGRAM

The flying spot scanner receives control signals directly from the
analog computer. These control signals are the X and Y deflection
voltages necessary to form a scanning pattern. The intensity, or Z
signal, voltage was not changed by the computer system in this
investigation; the intensity control of the scanner was separately
adjustable by the operator. The X-Y signals are generated by a digital
computer program and transmitted to the analog computer via DACs. The
transmitted light signal (TL) from the scanner is transmitted via the
analog patch board and ADCs to the digital computer for further
processing. Sequences of samples of the three voltages X, Y and TL
comprise the primary variables within the digital processor necessary for
subsequent image processing. The processed image is then sent to the
graphics terminal CRT and displayed to the operator.

Since the purpose of this investigation was to implement and analyze
some of the various aspects of interactive optical scanning, the
constraints imposed by the available processing hardware were not
considered detrimental to the objective. The XDS-9300 is a relatively
slow-speed digital computer having a 24-bit word length and a memory
cycle time of 1.75 microseconds. This particular installation has 32K
(32,768) words of available storage, much of which is occupied by a
real-time monitor. The DACs use 15 bits of precision at 100K (100,000)
words per second, while the ADCs provide 14-bit precision words at the
same transmission rate. Auxiliary storage is available on a drum or on
magnetic tape, but neither of these devices was utilized, for reasons
which are discussed in Section IV. Data transmission from the graphics
terminal to the XDS-9300 digital computer is at the rate of 250K words
per second, and at 160K words per second in the reverse direction.

The ADAGE AGT-10 Graphics Display is a digital computer with 8K
30-bit words of core storage and twin disk-packs for auxiliary storage.
The operator's console provides the operator with a 12 inch square CRT,
a teletypewriter keyboard, 16 push-button function switches, 6 variable
control dials, a light pen and 2 foot pedals. Neither the keyboard nor
the foot pedals were utilized during this investigation.

The  decision  to  incorporate  a  graphical  display  of  processed  images 
was  made  early  in  the  investigation.   It  was  apparent  that  in  order  to 
be  interactive,  an  image  processing  system  must  provide  the  investigator 
or  operator  with  a  visual  display.  As  the  only  other  available  display 
would  have  been  line  printer  plots,  the  advantages  of  a  CRT  display  were 
obvious.   It  was  therefore  decided  to  utilize  the  graphics  console  with 
its  associated  CRT,  function  switches  and  variable  control  dials  as  the 
operator's station. From this position almost all desirable control
over the optical scanner, image processor and visual display can be
exercised.

The  primary  operator  functions  were  considered  to  be  (1)  selection  of 
the  area  to  be  scanned  on  the  transparency,  (2)  selection  of  the  number 
of  samples  to  be  taken,  expressed  as  a  resolution  in  both  X  and  Y 
directions,  and  (3)  selection  of  an  intensity  threshold  which  determines 
whether  a  sampled  point  is  black  or  white.   The  CRT  displays  a  box  or 
frame which defines the area to be scanned. By adjusting the
corresponding variable control dials an operator can change the height, width
and  position  of  the  "scan  box"  to  enclose  the  desired  area  within  the 
transparency  boundaries.   Other  dials  enable  the  operator  to  select  a 
desired  intensity  threshold  and  the  number  of  sample  points  in  the  X  and 
Y  directions. 

Secondary  controls  include  the  function  switches  and  light  pen.   The 
16  function  switches  enable  the  operator  to  control  the  program  flow 
from the graphics display console. Such functions as starting the scan,
magnifying a displayed image, selecting either the contour tracing or
character  recognition  routines  and  other  "branching"  types  of  operations 
were  considered  necessary  for  the  desired  degree  of  interaction.   The 
light  pen  used  with  the  AGT-10  CRT  only  detects  the  presence  of  light 
on  the  scope  face;  the  associated  software  routine  returns  the  array 
location  pointer  which  identifies  that  particular  image  portion  designated 
by  the  pen.   Usage  of  the  light  pen  for  this  investigation  is  discussed 
in more detail in Section V.

IV.   PROBLEMS  ENCOUNTERED

The core storage limitations of the XDS-9300 were mentioned in Section
III. The overriding consideration affecting the decision not to utilize
the auxiliary storage available for the XDS-9300, however, was the small
available core storage (8K) of the graphics display computer. Almost half
of the AGT-10 core is occupied by a monitor and the routines necessary to
interface with the digital processor. A working figure of 4900 was
determined to be the maximum number of words which could be used to
display processed image points or lines on the CRT; 1900 of these words
are required to display information resulting from the application
routines of contour tracing and character recognition, leaving 3000
words. A single point displayed on the CRT, representing one black point
on the transparency, requires two words in memory; therefore a 1500 point
display restriction was imposed. Since even a coarse scan of 100 X 100
samples over a transparency which is 50 percent black may require 5000
display points, it is easy to see that the graphics computer core was a
primary limiting factor affecting the development of an interactive
system. Figures 6 and 7 illustrate the quantization of the two
transparencies used for the applications, and in both cases the number of
points displayed is very close to the maximum allowed. Despite this
limitation it was decided that sufficient points could be displayed on
the CRT to achieve the general objectives of the investigation.
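
The display budget quoted above can be checked with a few lines of
arithmetic. The constant and function names below are hypothetical; the
numbers are those given in the text.

```python
DISPLAY_WORDS = 4900      # words available for displaying points or lines
APPLICATION_WORDS = 1900  # words reserved for application display output
WORDS_PER_POINT = 2       # words needed to display one black point


def max_display_points(display_words=DISPLAY_WORDS,
                       application_words=APPLICATION_WORDS,
                       words_per_point=WORDS_PER_POINT):
    """Return how many image points fit in the remaining display words."""
    return (display_words - application_words) // words_per_point


def black_points(nx, ny, black_fraction):
    """Return the number of black samples in an nx-by-ny scan."""
    return int(nx * ny * black_fraction)
```

Under these figures the limit is 1500 points, while a 100 X 100 scan of
a half-black transparency would call for 5000, more than three times the
limit.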

(a) Hand-drawn contours used for transparency

(b) CRT display of quantized contours

FIGURE 6. QUANTIZATION OF CONTOUR PATTERN

(a) Hand-drawn characters used for transparency

(b) CRT display of quantized characters

FIGURE 7. QUANTIZATION OF HAND-DRAWN CHARACTERS

Of secondary importance was the memory available within the XDS-9300
where image processing operations were performed. One method of image
processing requires bulk array storage of X-Y coordinates and transmitted
light values prior to any processing of the scanned image. This method,
when applied to the 100 X 100 sample, 50 percent black example, would
fill 30K words in main storage, which is hardly feasible within the
constraints of a 32K machine. This assumes real-type variables which use
two words each in memory. Therefore each image point is processed during
the scanning operation, and scanning is terminated when 1500 image points
have been identified. This procedure obviates the need for auxiliary
storage yet provides sufficient data from a quantized image for
interaction.

The length of time an operator must wait between the decision to scan
an area and the presentation of the resultant quantized image on the CRT
display can be considered a good measurement of the "interactiveness"
of the system. Many parameters affect the time required to scan a given
image, only some of which are software controllable. The primary
hardware parameters are core storage and machine cycle time, both of
which have been previously discussed.

Resolution and intensity threshold are the two software controllable
parameters which have the greatest effect on scanning time. In general,
increasing or decreasing one or both of these parameters will result in
an increased or decreased scanning time respectively. Obviously, taking
10,000 samples from an image will not take as long as taking 20,000
samples from the same image, but the relation between resolution and
time is not linear. The reason is that while a scan at one value of
resolution might "find" 50 percent of the actual image points on the
transparency, taking twice as many samples (increased resolution) might
not result in finding twice as many image points. Therefore the scan
time required for the second case will not be twice as long as the time
for the first. Intensity threshold also affects scanning time but to a
lesser degree than resolution. Since all points which meet the
operator-selected intensity threshold are processed, a change in this
threshold will usually result in a change in the number of image points
which meet the criterion. Therefore, since processing time is a component
of scanning time, any change in the former is directly reflected in the
latter.

Various programming techniques available within Fortran IV and
XDS-9300 assembly languages were used to determine the minimum processing
time required to complete a scanning pattern exclusive of the resolution
and intensity threshold parameters. Particular attention was paid to the
arithmetic, memory access, and looping instructions within the scanning
sub-program. The resultant program is felt to require a minimum of
processing time to perform the scanning function.

Selection of the intensity threshold was initially performed by trial
and error until an acceptable value was obtained. This procedure usually
necessitated many repeated scans over the same area of a transparency.
It was determined that this trial and error selection could be greatly
simplified by instituting two operator-controlled functions resulting
in a "best guess" or "ball park" value for the threshold, which can then
be further adjusted by the previously mentioned variable control dial.
One method scans a selected area of the transparency and computes the
average transmitted light intensity; the other method computes the
minimum intensity. Both methods then use the new intensity value as the
"zero setting" on the control dial, which can then be adjusted up or
down. A very coarse scan is employed to minimize the time factor, and in
practice either method works quite satisfactorily.
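
The two "best guess" functions can be sketched as below. The names and
the representation of the coarse scan as a plain list of transmitted
light samples are assumptions made for illustration; treating the dial
adjustment as a simple additive offset is likewise an assumption.

```python
def initial_threshold(samples, method="average"):
    """Estimate a starting threshold from coarse-scan light samples.

    `method` selects one of the two operator functions described above:
    "average" uses the mean intensity, "minimum" the lowest intensity.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if method == "average":
        return sum(samples) / len(samples)
    if method == "minimum":
        return min(samples)
    raise ValueError("method must be 'average' or 'minimum'")


def adjusted_threshold(zero_setting, dial_offset):
    """Apply the operator's dial adjustment around the zero setting."""
    return zero_setting + dial_offset
```

The value returned by `initial_threshold` plays the role of the "zero
setting"; the operator then moves it up or down with the dial.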

The particular flying spot scanner used for this investigation, having
been built for laboratory experimentation rather than for commercial
applications, requires periodic minor adjustments by the operator.
However, these adjustments are of the "front panel" type and are not
considered to be significantly difficult in a laboratory environment.

V.   SYSTEM  APPLICATIONS

A.   CONTOUR  TRACING 

Two commonly used methods of reducing a quantized image to "single
line" form are (1) thinning, and (2) edge following or tracing. This
discussion is restricted to images which present a closed line
continuous contour and when quantized yield an N-wide pattern of
independent points (N is the average width of a line). Figure 8 shows the
model which is applicable to this discussion.

The  process  of  thinning  reduces  an  N-wide  pattern  of  points  into  a 
1-wide  image.   Briefly,  the  algorithm  for  this  process  examines  each 
point  and  its  relationship  with  its  immediate  neighbors;  only  those 
points  necessary  to  completely  describe  the  image  boundaries  are 
retained  [9].   As  this  procedure  is  not  easily  adapted  to  an  interactive 
system  the  tracing  method  was  chosen. 

The objective of a contour tracing routine using a flying spot
scanner is simply to scan an image only along its outside perimeter,
resulting in a continuous line contour as shown in Figure 9. If this
tracing can be displayed to the operator as the scanning progresses
around the contour then operator-scanner interaction can be achieved. The
scanning pattern now becomes that path within the N-wide image which
defines the outer boundary. An algorithm was developed using the models
shown in Figure 10.

FIGURE 8. CONTOUR TRACING MODEL

FIGURE 9. COMPLETED TRACINGS OF CONTOURS FROM FIGURE 6

FIGURE 10. MODEL FOR TRACING ALGORITHM

The center of Figure 10(a) represents a known point on the boundary
of a contour. The cardinal pointers from this point, numbered 1 through
8, represent direction of movement (DOM); the DOM serves as the basis
for the algorithm. It is easily seen that the DOMs point to the eight
neighbors of the center point, or "point under examination". A selected
DOM, together with the pointers on either side of the DOM, forms a "DOM
triple" which is used to locate the next point on the boundary.

Using Figure 10(b) as an example, the start point is the point at the
upper left corner of the quantized image (this pattern is typical of part
of a 4-wide contour line). To start the boundary tracing an assumption
must be made as to the initial DOM. In the example of Figure 10(b) it is
reasonable to assume an initial DOM of 7. To find the next point on the
boundary the DOM triple is applied to sample three neighbors of the last
known point. These three points are examined in counter-clockwise
order. The first black point found is considered a new boundary point,
the DOM from the previous point is computed, and the procedure iterates
until the contour is defined.

The connected points in Figure 10(b) are the result of the application
of this algorithm. The DOM progression would be: 7 (assumed), 8, 8, 8,
1, 1, 8, 1, 8, 1, 1, 1, 1, 2, 1, 2, 3, 3, 3. For the start point the
DOM triple is 6, 7, and 8, which defines the three possible "next points"
of (x-dx, y-dy), (x, y-dy), and (x+dx, y-dy); the coordinates of the last
known boundary point are (x, y).
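
One tracing step can be sketched as follows. Figure 10(a) is not
reproduced here, so the DOM numbering is an inference from the example
above: DOMs 6, 7 and 8 map to (x-dx, y-dy), (x, y-dy) and (x+dx, y-dy),
which is consistent with numbering the neighbors counter-clockwise
starting from DOM 1 = (x+dx, y). The function names and the `is_black`
callback are hypothetical.

```python
# Offsets (in units of dx, dy) for DOMs 1 through 8. The numbering is
# an assumption inferred from the DOM triple example in the text.
DOM_OFFSETS = {
    1: (1, 0), 2: (1, 1), 3: (0, 1), 4: (-1, 1),
    5: (-1, 0), 6: (-1, -1), 7: (0, -1), 8: (1, -1),
}


def dom_triple(dom):
    """Return the DOM and its two neighbors in counter-clockwise order."""
    before = (dom - 2) % 8 + 1  # pointer on the clockwise side
    after = dom % 8 + 1         # pointer on the counter-clockwise side
    return (before, dom, after)


def next_boundary_point(point, dom, is_black, dx=1, dy=1):
    """Return (next_point, new_dom), or None if the triple is all white.

    `is_black` is a caller-supplied function that samples the image at
    an (x, y) position and reports whether that point is black.
    """
    x, y = point
    for candidate in dom_triple(dom):
        ox, oy = DOM_OFFSETS[candidate]
        neighbor = (x + ox * dx, y + oy * dy)
        if is_black(neighbor):
            # The direction actually taken becomes the new DOM.
            return neighbor, candidate
    return None
```

A `None` result corresponds to a triple in which no black point was
found, a case the caller has to handle separately.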

As the tracing progresses the primary potential error condition occurs
when a "new" black point is not found from examination of the "next
points" defined by the DOM triple. Figure 11 illustrates this condition,
which can happen for a variety of reasons including a rapid change in the
incremental slope of the contour, selection of a coarse dx and dy, or an
insufficient intensity threshold setting. At point (1) the DOM = 7
produces an all-white triple which will not "find" the next point on the
boundary; the same situation exists at point (2) with a DOM = 1.
Therefore, provisions for a "lost contact search" must be included in the
tracing algorithm.

FIGURE 11. MODEL FOR LOST CONTACT SEARCH

A simplified "one time" search was deemed to be appropriate for this
investigation and requires only a change in the last known DOM. When a
DOM triple fails to produce the next point, a new DOM is assumed by
rotating the old DOM 90 degrees CCW, thereby defining a new DOM triple.
Should this new DOM also fail to produce a point, an additional 180
degree rotation is made; further search is not attempted. These two
DOM rotations along with the original DOM result in checking 7 of the
8 neighbors of the last known point. The 8th neighbor is already known
to be on the contour. This error condition procedure is used at points
(1) and (2) on the partial contour of Figure 11 to produce the connected
points. Additional search methods could be employed in the event the
DOM rotation method failed, such as increasing the intensity threshold,
increasing or decreasing dx and dy, or other similar techniques.
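
The claim that the two rotations cover 7 of the 8 neighbors can be
verified with a short sketch. It assumes that the DOMs are numbered 1
through 8 counter-clockwise in 45 degree steps, which is inferred from
the examples in the text rather than taken from Figure 10(a); the
function names are hypothetical.

```python
def rotate_dom(dom, degrees_ccw):
    """Rotate a DOM (1..8) counter-clockwise by a multiple of 45 degrees."""
    steps = degrees_ccw // 45
    return (dom - 1 + steps) % 8 + 1


def search_doms(dom):
    """Return the DOMs tried in order during a lost contact search."""
    first = rotate_dom(dom, 90)      # rotate the failed DOM 90 degrees CCW
    second = rotate_dom(first, 180)  # then a further 180 degrees
    return [dom, first, second]


def neighbors_checked(dom):
    """Return the set of neighbor directions covered by the three triples."""
    checked = set()
    for d in search_doms(dom):
        checked.update((rotate_dom(d, -45), d, rotate_dom(d, 45)))
    return checked
```

For an initial DOM of 7 the triples tried are centered on 7, 1 and 5,
and the only direction never examined is 3, which points back toward
the previously found boundary point.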

The contour tracing routine is initiated by a function switch; the
scanner  searches  across  the  transparency  from  an  operator-designated 
starting  point  until  a  contour  boundary  is  found,  at  which  point  the 
tracing  iterations  begin.   Each  point  found  is  immediately  displayed  to 
the  operator  on  the  CRT  with  line  segments  connecting  the  points.   As 
the  tracing  progresses  around  the  contour  the  operator  can  interact  with 
the  scanner  by  stopping  the  scan,  designating  the  contour  as  a  "completed 
contour"  and  storing  it  into  memory,  correcting  what  is  felt  to  be  a  tracing 
error,  or  combinations  of  these  functions. 

Figure 12(a) shows an error which has occurred during an otherwise
normal contour tracing. Because of the proximity of the contour being
traced to another, the scanner has "jumped" to the outer contour and
continued tracing along an undesired contour. On discovering that an
erroneous tracing is occurring, the operator can stop the scan and
then use the light pen to designate the point at which the error was
made. When the tracing routine is restarted at a point further along
the desired contour from the error point, a straight-line connection is
displayed between the two disjointed contour lines, the erroneous portion
of the tracing is erased, and tracing proceeds normally. This
corrective technique is applied to the tracing shown in Figure 12(a),
resulting in Figure 12(b). Figure 13 illustrates two other common error
conditions which can be corrected by light pen. The ability of the
operator to control the scanner while tracing a contour was considered
to be significant for the demonstration of an interactive optical
scanning system.

FIGURE 12. ERROR CORRECTION OF ERRONEOUS TRACING
(Panel label: (b) Corrected tracing)

FIGURE 13. OTHER POSSIBLE TRACING ERRORS
(Figure labels: error point, possible restart point)
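
The light pen correction described for Figure 12 amounts to truncating the
stored trace at the error point and joining it to the restart point. A
minimal Python sketch follows. It assumes that the trace is held as a list
of (x, y) points and that the error point is taken to be the stored point
nearest the light pen position; neither detail is stated in the text, and
the name correct_trace is hypothetical.

```python
# Sketch of light-pen correction of an erroneous tracing.
# ASSUMPTION: the trace is a list of (x, y) tuples, and the error point is
# the stored point closest to where the operator touched the light pen.

def correct_trace(points, pen_xy, restart_xy):
    """Erase everything traced after the error point, then append the
    restart point so the display joins the two with a straight line."""
    def dist2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    # Index of the stored point nearest the light-pen position.
    error_index = min(range(len(points)),
                      key=lambda i: dist2(points[i], pen_xy))
    # Keep the trace up to the error point; tracing resumes at restart_xy.
    return points[:error_index + 1] + [restart_xy]
```

Tracing would then continue from the restart point, with new points
appended to the returned list.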

Consideration was given to modifications to the tracing algorithm
which would compute the point at which the contour was completed,
"smooth" the finished contour by using sophisticated approximation
techniques, and include more rigorous searching procedures in the event
of "lost contact". However, it was determined that these modifications
would in general greatly increase computation time and thereby
significantly reduce the desired degree of interaction. The objective of
demonstrating interactive optical scanning within the constraints of
the laboratory equipment had been fulfilled and it was decided that
any further development of the interactive tracing technique was beyond
the scope of the investigation and not within the capability of available
equipment.

B.   TEMPLATE  MATCHING 

The  basic  idea  of  template  matching  was  discussed  in  Section  II. 
This  technique  was  one  of  the  first  used  in  early  character  recognition 
systems.   Template  matching  was  chosen  as  the  second  application  showing 
that  an  interactive  system  can  be  useful  in  image  processing. 

Character recognition by template matching can best be described
by a direct comparison with a type of children's toy where the child
must fit circular, square, triangular, and other shaped blocks into their
respective slots in the board. The slots then are templates for the various
shapes of blocks, and while a circular block might fit roughly into the
square slot it will fit best into the circular slot. While the procedure
for the toy is simplistic, the analogous character recognition technique
is considerably more complicated.

In order to employ the interactive scanning system in a character
recognition routine it was necessary to narrowly define the objective of
this application. Little consideration was given to scanning and
processing times, multifont characters, or techniques of registering the
quantized character with the template. After scanning and displaying
a quantized character on the graphics CRT, the operator can direct the
system to "recognize" the selected character and then to display the
answer to the operator. The objective then became the demonstration
of a simple, interactive character recognition routine.

A template matching algorithm was developed by using a 24 X 24
sample raster scan. The sample size was chosen to conform to the 24-bit
word size of the XDS-9300 computer so that upon completion of the scan
an "image array" of 24 words containing the 576 samples from the
character is stored in memory. The operator selects a character for
recognition by positioning the "scan box" so as to enclose the character and
then directs the system to "recognize". At each sample point a 0 or 1
is added to the word in the accumulator of the XDS-9300 depending on
whether the point is white or black respectively; the word in the
accumulator is then shifted left one bit and a new sample is analyzed.
A new word is constructed in the accumulator for each 24 samples until
the scan is completed. This 24 X 24 image array of 0's and 1's now
defines the quantized image and is then compared to the templates in
memory. A template is another 24 X 24 bit array for each recognizable
character with 1's only where 1's are expected in a sample of this character.
The templates were constructed to conform to the general type of font
shown in Figure 7.
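
The packing of samples into 24-bit words can be illustrated with a short
Python sketch. It is not the XDS-9300 routine. The function names are
hypothetical, and the sketch shifts the word before adding each new sample
(an assumption about the ordering) so that the first sample of a row ends
up in the most significant bit and no bit is shifted out of the word.

```python
# Sketch of packing a 24 x 24 binary raster into 24 words of 24 bits.
# ASSUMPTION: each row of 24 samples forms one word, and the word is
# shifted left before each new sample is added.

WORD_BITS = 24  # word size of the XDS-9300, per the text

def pack_row(samples):
    """Pack 24 samples (1 = black, 0 = white) into one 24-bit integer."""
    if len(samples) != WORD_BITS:
        raise ValueError("expected %d samples per word" % WORD_BITS)
    word = 0
    for sample in samples:
        word = (word << 1) | (1 if sample else 0)
    return word

def pack_image(rows):
    """Pack a list of 24 sample rows into an image array of 24 words."""
    if len(rows) != WORD_BITS:
        raise ValueError("expected %d rows" % WORD_BITS)
    return [pack_row(row) for row in rows]
```

A template in this representation is simply another list of 24 such words.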

The 576-bit image array is compared bit-for-bit with each of the
576-bit template arrays by using the logical AND operation. Each
resulting 24 X 24 array then contains 1's at those bit positions where a match
occurs between an image bit and a template bit (both are 1's), and 0's
where there is no match. Counting the number of 1's in this array yields
the number of match points between the image array and a particular
template. Figure 14 shows two typical quantized characters with their
corresponding templates.
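
The AND-and-count comparison can be sketched in Python as follows, assuming
that the image and each template are held as equal-length lists of integer
words. The names popcount and match_points are hypothetical.

```python
# Sketch of counting match points between an image array and a template.
# ASSUMPTION: both arrays are lists of integers, one integer per 24-bit word.

def popcount(word):
    """Number of 1 bits in a non-negative integer."""
    return bin(word).count("1")

def match_points(image, template):
    """AND the arrays word by word and count the 1's in the result."""
    return sum(popcount(i & t) for i, t in zip(image, template))
```

For example, with two-word arrays image = [0b1100, 0b0011] and
template = [0b0100, 0b0111], the AND results are 0b0100 and 0b0011, giving
3 match points.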

The recognition decision is computed by two different methods. The
"brute force" method simply uses that template which produces the greatest
number of match points (most matched) and displays on the CRT to the
operator the corresponding character. The second method (best percentage)
involves computing a percentage P for each image-template comparison by
using

         (no. of match points)         (no. of match points)
P  =  --------------------------  x  -----------------------------
       (no. of 1's in template)       (no. of 1's in image array)

Decision and display is now made as before using that template which
yields the largest value of P. In practice both methods are computed
during a single scanning operation.
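
The two decision methods can be compared with a small Python sketch. It
assumes that P is the product of the two ratios (match points over 1's in
the template, and match points over 1's in the image array), and that the
templates are held in a dictionary keyed by character. The names ones,
match_points, and recognize are hypothetical.

```python
# Sketch of the "most matched" and "best percentage" decisions.
# ASSUMPTION: P is the product of the two ratios described in the text.

def ones(words):
    """Total number of 1 bits in a list of integer words."""
    return sum(bin(w).count("1") for w in words)

def match_points(image, template):
    """Count positions where both the image and the template hold a 1."""
    return sum(bin(i & t).count("1") for i, t in zip(image, template))

def recognize(image, templates):
    """Return (most_matched, best_percentage) characters.

    templates maps each character to its list of words.
    """
    image_ones = ones(image)
    scores = {}
    for char, template in templates.items():
        m = match_points(image, template)
        t_ones = ones(template)
        if t_ones and image_ones:
            p = (m / t_ones) * (m / image_ones)
        else:
            p = 0.0  # an empty image or template cannot match
        scores[char] = (m, p)
    most_matched = max(scores, key=lambda c: scores[c][0])
    best_percentage = max(scores, key=lambda c: scores[c][1])
    return most_matched, best_percentage
```

A one-word toy case shows how the methods can disagree. With
image = [0b0111] and templates {"A": [0b0011], "B": [0b11110111]}, template
"B" has more match points (3 against 2), but its many unmatched 1's give it
the smaller P (about 0.43 against 0.67), so the most matched answer is "B"
and the best percentage answer is "A".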
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FIGURE  14.   QUANTIZED  CHARACTERS  WITH  IDEAL  TEMPLATE  MATCH 
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The  template  matching  routine  was  tested  using  the  16  characters 
shown  in  Figure  7.   Figures  15  and  16  show  typical  results  as  displayed 
to  the  operator  on  the  graphics  terminal  CRT.  Tables  I  and  II  both 
illustrate  the  "confusion"  factor  associated  with  each  recognition  method 
and  also  demonstrate  the  greater  accuracy  of  the  best  percentage  method. 
A  "confusion  table"  shows  the  results  of  a  number  of  recognition  tests 
of  a  certain  character  and  a  measure  of  the  system's  recognition  capability 
can  be  derived  from  the  total  number  of  answers  "off  the  diagonal".   For 
example,  in  Table  I,  of  10  trials  on  the  letter  "Y",  9  trials  resulted  in 
the  correct  answer  "Y"  and  1  trial  returned  a  "T".   By  dividing  the  total 
number  of  correct  answers  by  the  total  number  of  trials  for  each  method 
it  can  be  seen  that  the  results  of  the  tests  contained  in  Tables  I  and 
II  yield  83.1  percent  correct  for  the  most  matched  method  and  92.5  percent 
for the best percentage method.

FIGURE 15. DISPLAY OF TEMPLATE MATCHING RESULTS--BOTH ANSWERS CORRECT

FIGURE 16. DISPLAY OF TEMPLATE MATCHING RESULTS--ANSWERS DIFFERENT

TABLE I. CONFUSION TABLE FOR MOST MATCHED METHOD
(Column heading: Scanned Letter. Footnote: *Third row "F" in Figure 7)

TABLE II. CONFUSION TABLE FOR BEST PERCENTAGE METHOD
(Column heading: Scanned Letter. Footnote: *Third row "F" in Figure 7)
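
The percent-correct figure derived from a confusion table can be computed
as sketched below in Python. The nested-dictionary layout and the function
names are assumptions made for illustration. The only data used is the
single row quoted in the text for Table I (10 trials on "Y", of which 9
returned "Y" and 1 returned "T").

```python
# Sketch of scoring a confusion table.
# ASSUMPTION: confusion[scanned][answer] holds the number of trials in
# which the scanned letter produced that answer.

def percent_correct(confusion):
    """Correct answers (the diagonal) as a percentage of all trials."""
    correct = sum(row.get(scanned, 0) for scanned, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    return 100.0 * correct / total

def off_diagonal(confusion):
    """Total number of answers that differ from the scanned letter."""
    return sum(count
               for scanned, row in confusion.items()
               for answer, count in row.items()
               if answer != scanned)

# The one row of Table I that is quoted in the text.
table_i_row_y = {"Y": {"Y": 9, "T": 1}}
```

For this single row, percent_correct gives 90.0 and off_diagonal gives 1.
Applying the same calculation to all rows of a table yields the overall
figure for that method.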
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VI.   SUMMARY  AND  CONCLUSIONS 

The  desirability  of  optical  to  digital  data  conversion  was  briefly 
discussed  in  Section  I.   A  device  to  accomplish  this  conversion,  the 
flying  spot  scanner,  was  introduced  as  the  primary  building  block  around 
which  an  interactive  scanning  system  was  developed.   The  system  constraints
and  problems  encountered  were  described  together  with  the  solutions  used 
to  surmount  these  difficulties.   The  interactive  system  was  demonstrated 
in  applications  to  the  problems  of  contour  tracing  and  template  matching 
and  the  results  of  these  applications  were  briefly  discussed. 

It  is  concluded  that  interactive  optical  scanning  is  an  effective 
means  of  investigating  pattern  recognition  techniques.   In  particular, 
the  technique  of  interactive  contour  tracing  could  be  effectively  applied 
to  many  areas  utilizing  graphical  data,  some  of  which  were  discussed  in
Section  I. 

It  is  recommended  that  further  experiments  be  performed  with  the 
flying  spot  scanner  used  in  this  investigation  as  an  optical  input 
device  for  such  applications  as  gray-level  quantization,  machine  reading 
of  text,  and  biomedical  pattern  recognition. 

Techniques of gray-level quantization could be applied to the
output of the scanner so that more practical applications can be investigated.
By using the XDS-9300 auxiliary drum storage enough samples can be
processed to make gray-level quantization feasible within the system constraints,
and by judicious data handling the concept of the interactive system
could be retained and even expanded. A more thorough investigation of
the uses of the reference light signal voltage and digital control of
the Z-signal voltage (CRT spot intensity) would be helpful in this application.
To facilitate image processing investigations an additional output device
to produce "hard-copies" is necessary; the presently available devices
are not well suited for this purpose.

Practical  application  of  the  contour  tracing  routine  can  be  expanded 
to  include  bulk  storage  and  complete  processing  of  many  contours,  again 
retaining  system  interaction.   Such  a  system  could  be  very  useful  in  the 
analysis  of  oceanographical  and  meteorological  contour  charts  which  are 
readily  available.   Drum  storage  and  more  sophisticated  data  processing 
techniques  would  aid  in  this  type  of  investigation. 

Character  recognition  routines,  of  which  there  are  many,  could  be 
tested  and  evaluated  using  the  scanner  as  the  primary  input  device.   A 
necessary  prerequisite  to  this  type  of  application  would  be  a  routine 
for  automatic  registration  of  characters  during  or  after  the  scanning 
operation.   When  acceptable  alpha-numeric  recognition  algorithms  have
been  tested,  techniques  for  complete  optical  input  of  textual  data  could 
be  investigated. 

The potential of image processing as applied to the biomedical
field was briefly discussed in Section I. Both contour tracing and
gray-level quantization could be effectively employed in the problem of
cellular analysis by pattern recognition. A good example is the
distinctive shape of the sickle cell. The scanner used in this thesis cannot
easily handle batched inputs for large-volume production runs, but it
can certainly be used for recognition of biomedical data. This is another
application area where research can be significantly aided by the use of
interactive techniques.